library(readr)
library(ggplot2)
library(plotly)
library(ggthemes)
df <- read.csv('financial_inclusion_and_aml_risk_2017.csv')
head(df)
scatter <- ggplot(df, aes(x = Financial.Inclusion, y = AML.Risk)) +
  geom_point(color = 'blue') +
  # Overlay a least-squares fit of AML risk on log(financial inclusion)
  geom_smooth(aes(group = 1), method = 'lm', formula = y ~ log(x), se = FALSE, color = 'red')
scatter
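The red curve in the plot comes from geom_smooth fitting y ~ log(x). The same model can be fitted directly with lm() to inspect its coefficients. What follows is a minimal sketch on synthetic stand-in data; the `inclusion`, `risk`, and `demo` objects are hypothetical and are not the 2017 dataset.

```r
# Synthetic stand-in data (hypothetical values, not the 2017 dataset).
set.seed(123)
inclusion <- runif(126, min = 5, max = 100)
risk <- 9 - 1.2 * log(inclusion) + rnorm(126, sd = 0.8)
demo <- data.frame(Financial.Inclusion = inclusion, AML.Risk = risk)

# Same model form that geom_smooth(method = 'lm', formula = y ~ log(x)) fits.
fit <- lm(AML.Risk ~ log(Financial.Inclusion), data = demo)

# Intercept and slope on the log scale; a negative slope means
# risk falls as inclusion rises.
coef(fit)
```

Because the formula takes log(x), every Financial.Inclusion value must be strictly positive for the fit to be defined.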
# Countries to annotate on the scatter plot
pointsToLabel <- c("Russia", "Venezuela", "Iraq", "Myanmar", "Sudan", "Afghanistan",
                   "Iran", "Argentina", "Brazil", "India", "Burkina Faso", "China",
                   "South Africa", "Spain", "Botswana", "Rwanda", "Bolivia",
                   "United States", "Honduras", "Cambodia", "Azerbaijan", "Japan",
                   "New Zealand", "Singapore")
scatter <- scatter +
  geom_text(aes(label = Country), color = "black", fontface = 'bold',
            data = subset(df, Country %in% pointsToLabel), check_overlap = TRUE) +
  scale_x_continuous(name = "Financial Inclusion", limits = c(0, 105))
Scale for 'x' is already present. Adding another scale for 'x', which will replace the existing scale.
scatter
scatter <- scatter + scale_y_continuous(name="AML Risk", limits = c(0, 10))
Scale for 'y' is already present. Adding another scale for 'y', which will replace the existing scale.
scatter
scatter <- scatter + ggtitle('Financial Inclusion and AML Risk') + theme_economist()
ggplotly(scatter)
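# Pass the plot explicitly so the saved file does not depend on which plot was built last
ggsave('scatter.png', plot = scatter, height = 7, width = 10)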
The hypothesis was: Financial inclusion and money laundering risk are negatively correlated. That is, the greater a country's financial inclusion, the lower its money laundering risk.
Therefore, the null hypothesis was: The correlation between financial inclusion and money laundering risk is greater than or equal to zero.
I will use the conventional significance level of 0.05.
Based on the graph, there does appear to be a negative correlation.
cor.test(df$Financial.Inclusion, df$AML.Risk)
Pearson's product-moment correlation
data: df$Financial.Inclusion and df$AML.Risk
t = -9.115, df = 124, p-value = 1.733e-15
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.7276953 -0.5156258
sample estimates:
cor
-0.6334079
corr <- -0.6334079
p <- format(1.733e-15, scientific=F)
library(stringr)
str_glue("The correlation coefficient is {corr}")
The correlation coefficient is -0.6334079
str_glue("The P value is {p}")
The P value is 0.000000000000001733
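The values above were retyped by hand from the test output. One alternative is to keep the object returned by cor.test and read the estimate and P value from it. The sketch below assumes synthetic stand-in data (`x` and `y` are hypothetical, with 126 observations to mirror df = 124) and uses base sprintf so that it runs without extra packages; str_glue would work the same way.

```r
# Synthetic stand-in data (hypothetical values, not the 2017 dataset).
set.seed(1)
x <- rnorm(126)
y <- -0.6 * x + rnorm(126, sd = 0.8)

# Keep the test object so the numbers are read from it, not retyped.
res <- cor.test(x, y)
corr <- round(unname(res$estimate), 3)

# format.pval reports values below eps as "<eps" instead of a long decimal.
p <- format.pval(res$p.value, digits = 3, eps = 1e-4)

sprintf("The correlation coefficient is %s", corr)
sprintf("The P value is %s", p)
```

For the real data, the same pattern would apply to cor.test(df$Financial.Inclusion, df$AML.Risk).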
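Since the P value is far below 0.05 and the estimated correlation is negative (about -0.63), we can reject the null hypothesis. Note that cor.test ran a two-sided test by default; because the estimate is negative, the one-sided P value for a negative correlation is half the two-sided value, so the conclusion is unchanged.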
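The null hypothesis stated earlier is one-sided (correlation greater than or equal to zero), and cor.test can test it directly through its `alternative` argument. The following is a minimal sketch on synthetic stand-in data; `inclusion` and `risk` are hypothetical vectors, not the 2017 dataset.

```r
# Synthetic stand-in data (hypothetical values, not the 2017 dataset).
set.seed(42)
inclusion <- runif(126, min = 5, max = 100)
risk <- 8 - 0.04 * inclusion + rnorm(126, sd = 1)

# Default two-sided test, as in the notebook.
two_sided <- cor.test(inclusion, risk)

# One-sided test matching the alternative hypothesis: correlation < 0.
one_sided <- cor.test(inclusion, risk, alternative = "less")

# With a negative estimate, the one-sided P value is half the two-sided one.
c(two_sided = two_sided$p.value, one_sided = one_sided$p.value)
```

On the real data this would be cor.test(df$Financial.Inclusion, df$AML.Risk, alternative = "less"); the confidence interval it reports is then one-sided as well.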